201 research outputs found

    C-NMT: A Collaborative Inference Framework for Neural Machine Translation

    Collaborative Inference (CI) optimizes the latency and energy consumption of deep learning inference through the inter-operation of edge and cloud devices. Although beneficial for other tasks, CI has never been applied to the sequence-to-sequence mapping problem at the heart of Neural Machine Translation (NMT). In this work, we address the specific issues of collaborative NMT, such as estimating the latency required to generate the (unknown) output sequence, and show how existing CI methods can be adapted to this application. Our experiments show that CI can reduce the latency of NMT by up to 44% compared to a non-collaborative approach.
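
    The dispatch decision sketched below is our toy illustration, not the paper's C-NMT implementation: the output length is unknown before decoding, so it is estimated from the input length, and all latency coefficients are invented placeholders.

```python
# Toy collaborative-inference dispatch for NMT: estimate the (unknown)
# output length from the input length, then send the request to whichever
# device has the lower predicted latency. Constants are hypothetical.

def predict_output_tokens(n_input_tokens: int, slope: float = 1.1) -> int:
    """Rough proxy: NMT output length tends to track input length."""
    return max(1, round(slope * n_input_tokens))

def dispatch(n_input_tokens: int,
             edge_ms_per_token: float = 30.0,
             cloud_ms_per_token: float = 5.0,
             network_rtt_ms: float = 120.0) -> str:
    """Return 'edge' or 'cloud', whichever minimizes estimated latency."""
    n_out = predict_output_tokens(n_input_tokens)
    edge_latency = edge_ms_per_token * n_out
    cloud_latency = network_rtt_ms + cloud_ms_per_token * n_out
    return "edge" if edge_latency <= cloud_latency else "cloud"
```

    With these placeholder numbers, short sentences stay on the edge (no network round-trip) while long ones are offloaded to the faster cloud model.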

    A Semi-Empirical Model of PV Modules Including Manufacturing I-V Mismatch

    This paper presents an analysis of the impact of manufacturing variability in PV modules when they are interconnected into a large PV panel. The key enabling technology is a compact semi-empirical model that is built solely from information derived from datasheets, without requiring the extraction of electrical parameters or measurements. The model makes explicit the dependency of the output power on the quantities most affected by variability, such as the short-circuit current and the open-circuit voltage. In this way, variability can be included with Monte Carlo techniques and tuned to the desired distributions and tolerances. In the experimental results, we prove the effectiveness of the model in the analysis of the optimal interconnection of PV modules, with the goal of reducing the impact of variability.
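
    The Monte Carlo treatment of variability can be illustrated as follows (our simplified sketch, not the paper's semi-empirical model; the nominal values, fill factor, and tolerance are invented placeholders):

```python
import random

# Illustrative Monte Carlo sampling of the datasheet quantities most
# affected by manufacturing variability (short-circuit current Isc,
# open-circuit voltage Voc); module power is approximated with the
# textbook fill-factor formula P = FF * Isc * Voc.

_rng = random.Random(0)  # fixed seed for reproducibility

def sample_module_power(isc_nom=9.0, voc_nom=40.0, ff=0.75, tol=0.03):
    """Draw one module's output power with +/-tol uniform variability."""
    isc = isc_nom * _rng.uniform(1 - tol, 1 + tol)
    voc = voc_nom * _rng.uniform(1 - tol, 1 + tol)
    return ff * isc * voc

powers = [sample_module_power() for _ in range(10_000)]
mean_p = sum(powers) / len(powers)
spread = max(powers) - min(powers)  # variability seen across "modules"
```

    Repeating such draws per module, and combining them according to the panel interconnection, yields the power distribution of the whole panel.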

    Channel-wise Mixed-precision Assignment for DNN Inference on Constrained Edge Nodes

    Quantization is widely employed in both cloud and edge systems to reduce the memory occupation, latency, and energy consumption of deep neural networks. In particular, mixed-precision quantization, i.e., the use of different bit-widths for different portions of the network, has been shown to provide excellent efficiency gains with limited accuracy drops, especially with optimized bit-width assignments determined by automated Neural Architecture Search (NAS) tools. State-of-the-art mixed-precision quantization works layer-wise, i.e., it uses different bit-widths for the weight and activation tensors of each network layer. In this work, we widen the search space, proposing a novel NAS that selects the bit-width of each weight tensor channel independently. This gives the tool the additional flexibility of assigning a higher precision only to the weights associated with the most informative features. Testing on the MLPerf Tiny benchmark suite, we obtain a rich collection of Pareto-optimal models in the accuracy vs. model size and accuracy vs. energy spaces. When deployed on the MPIC RISC-V edge processor, our networks reduce the memory and energy for inference by up to 63% and 27%, respectively, compared to a layer-wise approach, for the same accuracy.
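
    A stripped-down illustration of the channel-wise idea (our toy, not the NAS tool itself; the importance scores, bit-widths, and channel sizes are hypothetical):

```python
# Toy channel-wise bit-width assignment: each output channel of a weight
# tensor gets its own precision (high bits for "important" channels, low
# bits for the rest) instead of one bit-width for the whole layer.

def channel_bits(channel_importances, hi=8, lo=2, threshold=0.5):
    """Assign hi bits to important channels, lo bits to the others."""
    return [hi if imp >= threshold else lo for imp in channel_importances]

def weight_memory_bits(bits_per_channel, weights_per_channel):
    """Total weight memory of the tensor, in bits."""
    return sum(b * weights_per_channel for b in bits_per_channel)

importances = [0.9, 0.1, 0.7, 0.2]       # hypothetical per-channel scores
bits = channel_bits(importances)          # -> [8, 2, 8, 2]
mixed = weight_memory_bits(bits, weights_per_channel=64)
uniform = weight_memory_bits([8] * 4, weights_per_channel=64)
saving = 1 - mixed / uniform              # fraction of weight memory saved
```

    In a real NAS, the per-channel assignment is learned jointly with the task loss rather than thresholded on a fixed score, but the memory accounting is the same.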

    Cost-aware design and simulation of electrical energy systems

    One fundamental dimension in the design of an electrical energy system (EES) is the economic analysis of the possible design alternatives, in order to ensure not just the maximization of the energy output but also the return on the investment and the possible profits. Since the energy output and the economic figures of merit are intertwined, an accurate analysis must consider these two aspects of the problem concurrently, in order to define effective energy management policies. This paper achieves that objective by tracking and measuring energy efficiency and cost effectiveness in a single modular framework. The two aspects are modeled separately, through the definition of dedicated simulation layers governed by dedicated virtual buses that elaborate and manage the information and energy flows. Both layers are simulated concurrently within the same simulation infrastructure based on SystemC-AMS, so as to recreate at runtime the mutual influence of the two aspects, while allowing the use of different discrete time scales for the two layers. Thanks to the tight coupling provided by the single simulation engine, our method enables a quick estimation of various cost metrics (net costs, annualized costs, and profits) of any EES configuration under design, via an informed exploration of the alternatives. To prove the effectiveness of this approach, we applied the proposed strategy to two EES case studies, exploring various management strategies as well as different types and numbers of power sources and energy storage devices. The analysis allowed the identification of the most profitable solutions, thereby improving the standard design and simulation flow of EESs.
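
    The cost metrics named above can be illustrated with textbook formulas (a minimal sketch under standard discounted-cost assumptions, not the SystemC-AMS framework; all inputs are hypothetical):

```python
# Annualized cost via the capital recovery factor (CRF), plus a simple
# yearly-profit figure; these are standard engineering-economics formulas,
# not the paper's simulation model.

def annualized_cost(capex, opex_per_year, lifetime_years, discount_rate):
    """CRF * CAPEX + yearly O&M, with CRF = r(1+r)^N / ((1+r)^N - 1)."""
    r, n = discount_rate, lifetime_years
    crf = r * (1 + r) ** n / ((1 + r) ** n - 1)
    return capex * crf + opex_per_year

def yearly_profit(energy_kwh, price_per_kwh,
                  capex, opex_per_year, lifetime_years, discount_rate):
    """Revenue from sold energy minus the annualized system cost."""
    revenue = energy_kwh * price_per_kwh
    return revenue - annualized_cost(capex, opex_per_year,
                                     lifetime_years, discount_rate)
```

    In the framework above, the energy term would come from the concurrent functional simulation rather than from a fixed yearly figure.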

    Improving PPG-based Heart-Rate Monitoring with Synthetically Generated Data

    Improving the quality of heart-rate monitoring is the basis for a full-time assessment of people’s daily care. Recent state-of-the-art heart-rate monitoring algorithms exploit PPG and inertial data to efficiently estimate subjects’ beats-per-minute (BPM) directly on wearable devices. Although these signals are easy to record (e.g., with commercial smartwatches), which makes this approach appealing, new challenges are arising. The first problem is fitting these algorithms into low-power, memory-constrained MCUs. Further, the PPG signal usually has a low signal-to-noise ratio due to the presence of motion artifacts (MAs) arising from movements of subjects’ arms. In this work, we propose using synthetically generated data to improve the accuracy of PPG-based heart-rate tracking with deep neural networks without increasing the algorithm’s complexity. Using the TEMPONet network as a baseline, we show that the heart-rate tracking Mean Absolute Error (MAE) can be reduced from 5.28 to 4.86 BPM on the PPGDalia dataset. Notably, to do so we only increase the training time, keeping the inference step unchanged. Consequently, the new and more accurate network still fits the small memory of the GAP8 MCU, occupying 429 KB when quantized to 8 bits.
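
    As an illustration of the kind of synthetic training window such an approach could produce (our simplified sketch; the actual generator used in the work above is not described here, and all amplitudes and frequencies are placeholders):

```python
import math
import random

# Toy synthetic PPG window: a clean cardiac sinusoid at the target BPM plus
# a sinusoidal motion artifact at a random frequency, labeled with the BPM.

def synth_ppg_window(bpm, fs=32, seconds=8, artifact_amp=0.5, seed=0):
    """Return (samples, bpm_label) for one labeled training window."""
    rng = random.Random(seed)
    f_hr = bpm / 60.0                    # cardiac frequency in Hz
    f_ma = rng.uniform(0.5, 3.0)         # motion-artifact frequency in Hz
    n = fs * seconds
    samples = [math.sin(2 * math.pi * f_hr * t / fs)
               + artifact_amp * math.sin(2 * math.pi * f_ma * t / fs)
               for t in range(n)]
    return samples, bpm

window, label = synth_ppg_window(72)
```

    Because only the training set grows, the deployed network's size and inference cost are unchanged, which is what keeps the model within the MCU's memory budget.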

    Energy-efficient adaptive machine learning on IoT end-nodes with class-dependent confidence

    Energy-efficient machine learning models that can run directly on edge devices are of great interest in IoT applications, as they can reduce network pressure and response latency, and improve privacy. An effective way to obtain energy efficiency with small accuracy drops is to sequentially execute a set of increasingly complex models, early-stopping the procedure for 'easy' inputs that can be confidently classified by the smallest models. As a stopping criterion, current methods employ a single threshold on the output probabilities produced by each model. In this work, we show that such a criterion is sub-optimal for datasets that include classes of different complexity, and we demonstrate a more general approach based on per-class thresholds. With experiments on a low-power end-node, we show that our method can significantly reduce the energy consumption compared to the single-threshold approach.
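
    The per-class criterion can be sketched in a few lines (our illustration with made-up thresholds, not the paper's code): the large model runs only when the small model's confidence falls below the threshold of its predicted class.

```python
# Two-stage inference with per-class confidence thresholds: 'easy' inputs
# stop at the small model, 'hard' ones escalate to the big model.

def staged_inference(probs_small, run_big_model, class_thresholds):
    """probs_small: class probabilities from the small model."""
    pred = max(range(len(probs_small)), key=probs_small.__getitem__)
    if probs_small[pred] >= class_thresholds[pred]:
        return pred                     # confident enough: early stop
    return run_big_model()              # escalate to the big model

# Hypothetical example: class 0 is 'easy' (low threshold), class 1 'hard'.
thresholds = [0.6, 0.95]
easy = staged_inference([0.7, 0.3], lambda: 0, thresholds)  # small suffices
hard = staged_inference([0.2, 0.8], lambda: 1, thresholds)  # big model runs
```

    With a single global threshold, the easy class would be forced to the same confidence bar as the hard one, wasting energy on big-model invocations it does not need.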

    Predicting Hard Disk Failures in Data Centers Using Temporal Convolutional Neural Networks

    In modern data centers, storage system failures are major contributors to downtimes and maintenance costs. Predicting these failures by collecting measurements from disks and analyzing them with machine learning techniques can effectively reduce their impact, enabling timely maintenance. While there is a vast literature on this subject, most approaches attempt to predict hard disk failures using either classic machine learning solutions, such as Random Forests (RFs), or deep Recurrent Neural Networks (RNNs). In this work, we address hard disk failure prediction using Temporal Convolutional Networks (TCNs), a novel type of deep neural network for time series analysis. Using a real-world dataset, we show that TCNs outperform both RFs and RNNs. Specifically, we can improve the Fault Detection Rate (FDR) by ≈ 7.5% (FDR = 89.1%) compared to the state-of-the-art, while simultaneously reducing the False Alarm Rate (FAR = 0.052%). Moreover, we explore the network architecture design space, showing that TCNs are consistently superior to RNNs for a given model size and complexity and that even relatively small TCNs can reach satisfactory performance. All the code to reproduce the results presented in this paper is available at https://github.com/ABurrello/tcn-hard-disk-failure-prediction.
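
    The building block that distinguishes TCNs can be shown in plain Python (a didactic sketch of a dilated causal convolution, not the networks evaluated above):

```python
# Dilated causal 1-D convolution: y[t] = sum_k w[k] * x[t - k*dilation].
# Causality means each output sees only past samples; stacking layers with
# growing dilation widens the receptive field over long disk-health series.

def causal_dilated_conv(x, kernel, dilation=1):
    """Apply one dilated causal convolution with zero padding on the left."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out
```

    For example, with kernel [1, 1] and dilation 2, each output sums the current sample and the one two steps back, without ever looking into the future.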

    A comparison analysis of BLE-based algorithms for localization in industrial environments

    Proximity beacons are small, low-power devices capable of transmitting information at a limited distance via the Bluetooth Low Energy (BLE) protocol. These beacons are typically used to broadcast small amounts of location-dependent data (e.g., advertisements) or to detect nearby objects. However, researchers have shown that beacons can also be used for indoor localization, by converting the received signal strength indication (RSSI) to distance information. In this work, we study the effectiveness of proximity beacons for accurately locating objects within a manufacturing plant by performing extensive experiments in a real industrial environment. To this purpose, we compare localization algorithms based either on trilateration or on environment fingerprinting combined with a machine-learning-based regressor (k-nearest neighbors, support-vector machines, or a multi-layer perceptron). Each algorithm is analyzed in two different types of industrial environment. For each environment, various configurations are explored, where a configuration is characterized by the number of beacons per square meter and the density of fingerprint points. In addition, since the fingerprinting approach relies on a preliminary site characterization, it may lead to location errors in the presence of environment variations (e.g., movements of large objects). For this reason, the robustness of fingerprinting algorithms against such variations is also assessed. Our results show that fingerprint solutions outperform trilateration, while also showing good resilience to environmental variations. Given the similar error obtained by all three fingerprint approaches, we conclude that k-NN is the preferable algorithm due to its simple deployment and low number of hyper-parameters.
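
    Two of the building blocks compared above can be sketched as follows (illustrative only: the path-loss constants and fingerprint data are invented, and the real study calibrated per environment):

```python
import math

# (1) RSSI-to-distance conversion for trilateration, via the log-distance
#     path-loss model: RSSI = txPower - 10 * n * log10(d).
# (2) A tiny k-NN fingerprint regressor: the query's position is the mean
#     position of the k reference points with the closest RSSI vectors.

def rssi_to_distance(rssi_dbm, tx_power_dbm=-59.0, path_loss_exp=2.0):
    """Invert the log-distance model to get a distance in meters."""
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10 * path_loss_exp))

def knn_locate(fingerprints, query_rssi, k=3):
    """fingerprints: list of ((x, y), [RSSI per beacon]) reference points."""
    nearest = sorted(fingerprints,
                     key=lambda fp: math.dist(fp[1], query_rssi))[:k]
    x = sum(fp[0][0] for fp in nearest) / k
    y = sum(fp[0][1] for fp in nearest) / k
    return x, y

# Invented fingerprint map with two beacons and four reference points.
fps = [((0, 0), [-50, -80]), ((0, 1), [-55, -75]),
       ((1, 0), [-60, -70]), ((5, 5), [-90, -40])]
x_est, y_est = knn_locate(fps, [-52, -78])
```

    Fingerprinting sidesteps the fragile RSSI-to-distance step entirely, which is one reason it proves more robust than trilateration in cluttered industrial spaces.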